 rca method


Online Multi-modal Root Cause Analysis

arXiv.org Artificial Intelligence

Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems. Traditional data-driven RCA methods are typically limited to offline applications due to high computational demands, and existing online RCA methods handle only single-modal data, overlooking complex interactions in multi-modal systems. In this paper, we introduce OCEAN, a novel online multi-modal causal structure learning method for root cause localization. OCEAN employs a dilated convolutional neural network to capture long-term temporal dependencies and graph neural networks to learn causal relationships among system entities and key performance indicators. We further design a multi-factor attention mechanism to analyze and reassess the relationships among different metrics and log indicators/attributes for enhanced online causal graph learning. Additionally, a contrastive mutual information maximization-based graph fusion module is developed to effectively model the relationships across various modalities. Extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of our proposed method. Root Cause Analysis (RCA) is crucial for identifying the underlying causes of system failures and ensuring the high performance of microservice systems (Wang et al., 2023a; Li et al., 2021; Wang et al., 2023c).
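The OCEAN abstract above mentions dilated convolutions as the means of capturing long-term temporal dependencies. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation: the function names, the all-ones kernels, and the dilation schedule (1, 2, 4) are illustrative assumptions. It shows that a stack of causal dilated layers sees a window of length 1 + (k - 1) * (sum of dilations), so doubling the dilation per layer grows the window exponentially with depth.

```python
def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_j kernel[j] * x[t - j * dilation].

    Samples before the start of the series are treated as zero, so each
    output depends only on the current and past inputs (causal).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


def impulse_span(kernel_size, dilations, length=32):
    """Empirical check: push a unit impulse through a stack of all-ones
    kernels (an illustrative choice) and count how many output steps it
    reaches."""
    signal = [1.0] + [0.0] * (length - 1)
    for d in dilations:
        signal = dilated_causal_conv1d(signal, [1.0] * kernel_size, d)
    return sum(1 for v in signal if v != 0.0)


if __name__ == "__main__":
    dilations = [1, 2, 4]  # hypothetical schedule: doubles at each layer
    print(receptive_field(2, dilations), impulse_span(2, dilations))
```

With three layers of kernel size 2, the formula and the impulse test agree on a window of 8 time steps. A real model would learn the kernel weights and typically use a tensor library rather than Python lists.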


LEMMA-RCA: A Large Multi-modal Multi-domain Dataset for Root Cause Analysis

arXiv.org Artificial Intelligence

Root cause analysis (RCA) is crucial for enhancing the reliability and performance of complex systems. However, progress in this field has been hindered by the lack of large-scale, open-source datasets tailored for RCA. To bridge this gap, we introduce LEMMA-RCA, a large dataset designed for diverse RCA tasks across multiple domains and modalities. LEMMA-RCA features various real-world fault scenarios from IT and OT operation systems, encompassing microservices, water distribution, and water treatment systems, with hundreds of system entities involved. We evaluate the quality of LEMMA-RCA by testing the performance of eight baseline methods on this dataset under various settings, including offline and online modes as well as single and multiple modalities. Our experimental results demonstrate the high quality of LEMMA-RCA. The dataset is publicly available at https://lemma-rca.github.io/.


AI Empowered Net-RCA for 6G

arXiv.org Artificial Intelligence

To realize the vision of connecting everything worldwide, sixth-generation (6G) wireless networks are receiving unprecedented attention and are anticipated to build a bridge to the smart society of the future. Compared with 5G, 6G is expected to boost network spectrum efficiency, support massive access, and improve reliability and latency, as shown in Table 1. Consequently, future 6G networks are expected to support wireless connections for various emerging applications and massive numbers of intelligent devices, such as extended reality (XR) services, telemedicine, and brain-computer interfaces, and to deliver low latency and high data rates to heterogeneous devices. We illustrate the key emerging 6G applications in Fig. 1. Specifically, the application types of 6G can be classified as MBRLLC, mURLLC, HCS, and MPS [8]. The most representative 6G use cases are presented as follows; the network requirements are shown in Table 2. Holographic Type Communication (HTC): HTC is expected to deliver 3D images from one or multiple source nodes to different destinations. Because recording and reconstructing these images involves extremely large volumes of data, HTC requires transmission bandwidth up to the Tbps level.